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Abstract. An overview is given of Bayesian inversion and regularization proce- 
dures. In particular, the conceptual basis of the maximum entropy method (MEM) 
is discussed, and extensions to positive/negative and complex data are highlighted. 
Other deconvolution methods are also discussed within the Bayesian context, focus- 
ing mainly on the comparison of Wiener filtering. Massive Inference and the Pixon 
method, using examples from both astronomical and non-astronomical applications. 



1 Introduction 

In the next few years there will exist all-sky datasets from two new satellite 
missions for the Cosmic Microwave Background (the MAP and Planck mis- 
sions), along with very large datasets from optical surveys such as 2dF and 
Sloan. The combined effect of these new data on quantitative cosmology will 
be enormous, but at the same time pose great problems in terms of the scale 
of data analysis effort required. 

As an example, the Planck Surveyor satellite, due for launch in 2007, 
combines both HEMT and bolometer technology in 10 frequency channels 
covering the range 30 GHz to 850 GHz, with a highest angular resolution of 
5 arcmin. An artist's impression of this satellite is shown in Figure |l|, and the 
experimental parameters of the Planck mission are summarized in Table |]. 
The mission is designed to give high sensitivity to CMB structures, together 



Table 1. Approximate experimental parameters of the Planck satellite. HFI refers 
to the high frequency part of the instrument, and LFI is the low frequency in- 
strument. The AT It sensitivity is per beam area in one year (thermodynamic 
temperature) 
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Fig. 1. Artist's impression of tlie Planck Satellite 

with sufficient frequency coverage to enable accurate separation of the non- 
CMB physical components. These will typically be Galactic dust, synchrotron 
and free-free emission, together with extragalatic radio and sub-mm/FIR 
sources. Also present will be the effects of Sunyaev-Zeldovich distortion of the 
CMB as it passes through the hot intracluster gas of clusters of galaxies. This 
separation of components must be performed using data from approximately 
100 detectors in total, spanning ten frequencies, and with the sky map at 
each frequency containing on the order of lO'^ pixels. These figures give some 
idea of the scale of the problem, for just this mission alone, and suggest why 
the idea of 'mining the sky' is appropriate. 

The task of analysing modern large datasets is undeniably challenging in 
terms of the amount of data to be processed. In the pursuit of 'precision cos- 
mology', however, we are faced with the additional requirement that the data 
must be analysed in a statistically rigorous way. In CMB observations, for 
example, one is interested in the statistical properties of CMB anisotropics, 
most commonly summarised by their power spectrum CV, from which it is 
possible to derive estimates and confidence limits on fundamental cosmologi- 
cal parameters such as the matter density of the Universe or the value of the 
cosmological constant. Similar statistical measures are central to the analysis 
of optical surveys. Thus, in modern cosmology, one is faced with the dual 
problem of analysing large datasets while retaining statistical rigour. In the 
present paper, we discuss both aspects, particularly in the context of how an 
efficient choice of 'basis functions' can lead to both an improved analysis and 
large speed-up factors. 

It is now generally accepted that the correct way to draw inferences from 
any set of data is to apply Bayes' theorem in a consistent and logical manner. 
This provides a general framework in which the analysis of CMB and optical 
survey data can be performed. Let us consider the generic problem at hand. 
In order to recover an underlying signal s from some measured data d, we 
commonly need to solve an inverse problem such as 

d = Rs + €, (1) 

where R represents the response matrix of the experiment and e is the instru- 
mental noise vector. For simplicity, we arc assuming here that the inversion 
problem is linear, although this is not strictly necessary. In any case, owing 
to the presence of noise, the properties of which are only known statistically 
(sometimes even this is not true), the inversion problem is degenerate. Even 
in absence of noise, a direct inversion would, in general, not be possible, since 
the response matrix R is normally not invertible. For instance, R may be a 
blurring (beam) function, which strongly suppresses higher spatial frequen- 
cies, or it may represent a beam-differencing experiment where some spatial 
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frequencies are actually set to zero. Thus, it is clear that some kind of statis- 
tical technique is needed in order to regularise the inversion. This naturally 
leads us to a Bayesian approach. This is one of the most powerful current 
techniques of image reconstruction. 

In the present paper, we discuss different deconvolution methods within 
the Bayesian framework, showing that different techniques are actually ob- 
tained by different choices of priors and/or basis functions. The outline of 
the paper is as follows. gives an introduction to Bayes' theorem and de- 
rives the Wiener filter in this context. §^ describes the Maximum Entropy 
Method (MEM), including extensions to positive/negative and complex data, 
and discusses some applications. The Pixon Method is introduced in §|[ ^ 
discusses multiscale and wavelet MEM. The Massive Inference technique is 
introduced in §|[ Finally, conclusions are given in 



2 Mining the Sky with Bayes' Theorem 

Let us recall the original problem 

d = Rs + e, (2) 

For simplicity we assume (s) = = (e) 

To obtain the 'best' sky reconstruction we chose to maximise the proba- 
bility Pr{s\d) using Bayes' theorem 

P<^\d) = j^^Pr{d\s)Pr{s), (3) 

where Pr{s\d) is the posterior probability of an underlying signal (or true 
sky) s given some data d, Pr{d\s) is the likelihood funtion and P{s) is the 
prior probability. At the first level of Bayesian inference Pr{d), the evidence, 
is merely a normalisation, which implies we wish to maximise 

Pr{s\d) (X Pr{d\s) Pr{s) (4) 

For convenience we consider the case of Gaussian noise, although this is 
not necessary (for instance there exist many applications to Poisson noise). 
For Gaussian noise, the likelihood is simply 

Pr(d|s) oc e-i^"^"^ = e-'^id-RsfN-^d^Rs) (5) 

where TV — (ee^) is the noise covariance matrix. This is usually written as 
Pr{d\s) oc exp(— ^x^). Now we have to decide on the assignment of the prior, 
Pr{s). As a first approach, we assume that s (which, for a CMB experiment, 
for example, would include CMB anisotropics, the Sunyaev-Zeldovich effect, 
Galactic emission, etc.) is a Gaussian random variable, described by a known 
covariance matrix C = {ss'^) (including all cross-correlations) so that 

Pr{s) (X e-^s^C-^'s (6) 
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In this case the posterior probabihty is 

Pr{s\d) oc Pr{d\s) Pr{s) oc e"^(x^ + s^C'^^) ^-^^ 

which one must maximise with respect to s to obtain the reconstruction. This 
is equivalent to minimising F = \{x^ + s^C~^s). In fact, we can do better 
than this. By completing the square in s (e.g. we can recover the whole 
posterior distribution: 

Pr(s|d) oc e"5(^-^)^-2;"'(^-^) (8) 
where the sky reconstruction s is given by 

s = Wd, 

W = (C-^ + R^N-^Ry^R^N-^, (9) 
where W is in fact the Wiener matrix and 

E = {C-^ + R'^N-^R)-^ (10) 

is the reconstruction error matrix E = ((s — s){s — s)"^). Thus we have 
recovered the optimal linear method, which is usually derived by minimising 
residual variances. 

We recall that in general the response matrix R will not be invertible. 
However, it is remarkable that the estimation of the sky s = Wd can still be 
computed no matter how singular R is, since it only needs R^ to be evaluated. 
This is an example of regularization. Notice how if the were not present 
in W we would just have W — R~^. We say that we have regularized the 
inverse. 

The above solution is 'easy' to calculate and has known reconstruction 
errors. It is, however, by no means the best solution in real problems. For 
instance, consider the standard 'Lena' IEEE test image in Fig. |. The original 
image (top left panel) is smoothed with a Gaussian blurring function with a 
FWHM of 6 pixels followed by the addition of noise (top right panel). The 
Wiener filter reconstructed image is given in the bottom right panel. Although 
some improvement is achieved, spurious structure ('ringing') appears at small 
scales. For comparison, a result generated using a pixon method (see §^) is 
also shown. 



Fig. 2. Comparison of the performance of a pixon method to the Wiener filter for 
the 'Lena' test image. The original image has been blurred with a Gaussian blurring 
function with a FWHM of 6 pixels followed by addition of noise 
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3 The McLximum Entropy Method 

The main shortcoming of the Wiener filter is that rehes on the assumption 
of Gaussianity and the a priori knowledge of the covariance matrix. Real 
data, however, is rarely so simple, and we must therefore consider alternative 
priors. A possible choice is the entropy prior (Maximum Entropy Method, 
MEM). 

Usually MEM is applied to positive, additive distibutions (PADS). Let h 
be the (true) pixel vector we are trying to estimate. In this case very general 
considerations of subset independence, coordinate invariance and system in- 
dependence lead uniquely to the prior Pr{h) oc e*^"^ where the 'entropy' S 
(||]) of the image is given by 

5(/i,m) =^ (^/i, -/i,ln(^^^^ (11) 

where m is the measure on an image space (the model) to which the image 
h defaults in the absence of data (it can be shown that the global maximum 
of S occurs at ft. = m). In fact, it has been shown recently ([^) that, if 
there exist linear constraints on the signal (e.g. like d — Rs + e in our case), 
the form of the entropic prior is determined uniquely by simply requiring 
consistency with the sum and product rules of probability. 

'Subset independence' implies, however, that no a priori correlations be- 
tween the pixels of ft should be present. So, is it possible to include known 
covariance structure, as in the Wiener method?. The answer is yes!. Given a 
sky s with C = {ss^), we form the Cholesky decomposition 

C = LL^ (12) 

where L is an upper triangular matrix, and define a hidden, uncorrelated i.i.d. 
(independent, identically distributed) unit variance hidden field h related to 
s by 

s = Lh (13) 

It is straightforward to show that, with this construction, {ss^) — C . Thus 
the derivation of 5(ft, m) applies to this hidden variable and wc need to 
maximise 

Pr{h\d) oc e-\x'{h.)+aS{h,m) 
where x^{h) = {d - RLhfN-^{d - RLh) (14) 

The vector ft that maximises this expression is the MEM reconstruction. 
Note that a is a regularising parameter of the relative weight of the data and 
the prior. Large a favours large entropy (i.e., ft close to m) at expense of 
the data, whereas small a gives more weight to the data. The parameter a 
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can be estimated itself via Bayesian methods (j2|). Crudely, the value of a is 
such that 

X'W«^, (15) 

where N is the number of good degrees of freedom in the data. Note that for 
a = 0, the method reduces to maximum likelihood. For a = 2 and small hi (in 
fact for hi < 3m, Vi) it can be shown that the method is Wiener filter again 
. This means that the Wiener filter is simply a quadratic approximation 
of MEM with a = 2. 

Another important issue is how to calculate the errors on the reconstruc- 
tion. This is performed by making a Gaussian approximation to the posterior 
probability distribution Pr{h\d) at its peak h. Moreover, by sampling from 
this distribution within, say, the Icr surface, one can generate sample recon- 
structions all compatible with the data, which can be very informative. 

An interesting application of MEM to astronomical data is the recovery 
of the projected mass density of a galaxy cluster from observations of its 
gravitational lensing effects on background galaxies ((5|). This technique is 
particularly interesting since it directly maps the dark matter halos in clus- 
ters. Moreover, together with the projected mass distribution, an estimation 
of errors is also obtained. Figure || shows the projected mass density of the 
cluster MS1054 reconstructed from shear data obtained by Q using the Hub- 
ble Space Telescope. 



Fig. 3. The shear field in the direction of the galaxy cluster MS1054, determined 
from HST observations, and the corresponding MEM reconstruction of the pro- 
jected mass density in the cluster (courtesy of Phil Marshall) 
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A further extension of MEM is necessary in order to apply the algorithm 
to positive and negative data (such as CMB) and also to complex data (e.g. 
Fourier transforms). Indeed, it is possible to generalise MEM to both of these 
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kinds of data. For a positive and negative image, we just need to write h as 
the difference between two positive images 

h ^ u V (16) 

Applying continuity constraints, we then obtain the entropic prior for posi- 
tive/negative images as 



S{h, m) — tpi — 2mi — hi In 



2to,- 



with ^i = \/hf + 4mf . (17) 

The posterior probability is given, as before, by exp(— + <^S)i but now 
using this generalised definition of entropy. This result can be derived directly 
from counting arguments ('monkeys throwing balls') 0. Regarding complex 
images, we can just treat real and imaginary parts separately: 

S{h, m) = S{^{h),di{m)) + S{^{h), S(m)) (18) 

where 3? and 5 denote the real and imaginary parts of each vector. This gen- 
eralisation to positive/negative and complex images is actually a key point, 
since MEM now can be applied to the Fourier Transform (or Spherical Trans- 
form for the all-sky case) of the original maps or images. But in Fourier space, 
modes at different k (or Z,m on a sphere) can generally be treated indepen- 
dently, therefore we can apply MEM separately at each mode. This means 
that we have Npix minimisations with respect to one or a few variables, in- 
stead of a single minimisation with respect to A^p^x variables. This leads to a 
huge speed-up in the algorithm, which is crucial for large data sets. 

We call this FastMEM or FourierMEM. This method has been successfully 
applied to reconstructing the different components of the microwave sky from 
simulated Planck data of small patches of the sky (Q). An application of 
FastMEM to Planck data is also given in this volume (Q). Moreover, an 
extension of the algorithm to deal with all-sky data, which works in spherical 
harmonic space, is currently being tested (H). Fig. H shows the performance 



Fig. 4. Results of MEM as applied to Planck simulated data on the whole sky. 
From top to bottom the maps correspond to input CMB, input dust and residuals 
for the MEM reconstructed CMB (from g) 



of this technique for simulated all-sky Planck data. The input CMB and 
Galactic dust maps are shown, but in addition the simulations also contain 
Galactic synchrotron and free-free emission as well as thermal and kinetic 
Sunyaev-Zeldovich effect from clusters of galaxies. The bottom panel shows 
the residuals in the MEM reconstruction of the CMB. It is striking that, even 
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when no cut of the Galactic plane has been attempted, no obvious emission 
from the Galaxy has contaminated the reconstructed CMB, except for a few 
pixels in the centre of the map. 

FastMEM has also performed very well in non-astronomical data. The 
left panel of Fig. ^ shows the blurred image (due to the instrument response) 
of a pollen grain obtained by combining 20 images taken at different depths 
with a confocal microscope. The reconstructed image achieved by FastMEM 
is shown in the right panel. The amount of detail recovered with respect 
to the original image is very noticiable. Besides, FastMEM takes around 45 
seconds to perform such a reconstruction versus 50 minutes needed by real 
spece MEM. 



Fig. 5. Blurred image of a pollen grain and FastMEM reconstruction 
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4 The Pixon Method 

A recent addition to the stable of image reconstruction algorithms is the pixon 
method ([0). The basic idea behind this technique is to minimise the number 
of degrees of freedom used to describe an image while still maintaining an 
acceptable fit to the data. This is achieved by, instead of working in the pixel 
basis, describing the image using 'pixons', which are essentially flexible pixels 
able to change shape and size. For example, in the pixon approach, only a 
few large pixons are needed to describe the background or parts of the image 
with a low signal to noise ratio, whereas a larger number of smaller pixons 
are used where the signal has more detail. 

The pixon idea can in fact be phrased within in the framework of Bayes' 
theorem via the introduction of a prior. Let there be N counts (e.g. photons) 
in total that must be assigned to n cells (or pixons) and assume Ni counts 
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end up in pixon i , so Ni = N . Thus, we need to choose n and Ni and also 
the position of the pixons in the least informative way. The total number of 
possibilities is given by . The total number having a given A^i in pixon 1, 
in pixon 2, etc. is , . So, the probability of a given arrangement is 

Pr(arrangement) = ^^jj j^ , (19) 

Note that this probability favours arrangements with a small number of 
pixons containing a large number of counts instead of having a large number 
of cells with only few counts. Indeed, it is maximised by n = 1 and Ni = N. 
So, we can use this probability as a prior combined with the likelihood term 
to obtain the posterior probability. Moreover, using Stirling's approximation, 
we can write this as 

Pr(arrangement) ~ ^ exp ^- ^ ^ In ^ j (20) 

which is similar to an entropy prior. Thus, the pixon method can be seen as 
a MEM that allows 'pixel sizes' (and shapes) to vary as well. 

Note that we have described the 'pure form' of the pixon method, but so 
far the commercial code has had to include a large number of modifications 
relative to this in order to get an algorithm that works properly and rapidly 
enough. An independent implementation of the Pixon method for cluster 
detection is given by in this volume. Indeed, the notion of distinct 'hard- 
edged' pixons of different shapes and sizes is unhelpful in the reconstruction 
of general images, and current pixon algorithms tend to favour a 'fuzzy- 
pixon' approach, which is equivalent simply to the assumption of an instrinsic 
correlation length for the structure in the image, which can vary across the 
image. Thus, the reconstructed image / is written as the local convolution of 
a pseudo- image /psoudo with a pixon shape function K , whose width varies 
over the image 

/(a;.) = ^ K(^^^-j^^Ipscudo{y)dVy, (21) 

where Xi is the location of pixel i and Si is the pixon size at pixel i. The 
pixon shape can be arbitrary (which can be a strength or a weakness of the 
method in different circumstances). A common choice is a truncated inverted 
paraboloid (jl^), which leads to (except for a normalisation) 

y-x,\<6. (22) 
y -Xi\> (5j 




The basic algorithm is very simple. Firstly, some initial choice is made for 
the pixon width Si in each pixel. Often this can be simply Si — 1 for all i, but 
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this can lead to 'freezing-in' of unwanted small-scale structure, so in some 
cases the 6i are chosen to be somewhat larger. In any case, given the initial 
choice of the Si the maximum-likelihood solution for the pseudo-image / is 
obtained in a standard manner. Then, keeping this pseudo-image fixed, the 
pixon widths Si are varied until a set is found where each Si has the largest 
possible value that is still consistent with the data in a least-squares sense. 
This whole two-step process is then repeated until convergence is achieved. 

Fig. 6. Comparison of the performance of the Pixon method to traditional 
MEM. Top row (from left to right): original image, blurred and noisy 
image, Pixon reconstructed image, MEM reconstruction. Centre row: sur- 
face plots for the images in top raw. Bottom row: blurring function, ad- 
ditive noise, residuals for Pixon Method, residuals for MEM . (Images from 
pttp : //casswww.ucsd. edu/personal/puetter/pixonpage .html ) 



Fig. ^ showed a comparison between the pixon method and the Wiener 
filter. We see that the pixon method clearly outperforms Wiener filter. An- 
other example is given in Fig. ^. In this synthetic image composed of a 
sharp peak and a valley is used to compare the pixon method and MEM. The 
image has been blurred and Gaussian noise added. We see that the recovery 
of the peak and valley are similar for both Pixon and MEM, but the low 
level noise present in the background of the MEM reconstruction has been 
successfully removed in the Pixon image. In addition, the residuals for the 
Pixon method seem compatible with random noise whereas MEM produces 
residuals correlated with the signal. 

5 Multiscale and wavelet MEM 

Although the reconstructions in Figure |^ show the pixon method to be very 
effective, the comparison is not strictly reasonable, since it employs the tra- 
ditional MEM technique, which is rarely still used in this simple form. 

It has long been realised that the key to effective image reconstruction is 
to reduce the number of degrees of freedom one is trying to constrain. The 
simplest way of achieving this goal is via the assumptions of an intrinsic cor- 
relation length that does not vary across the image ([|l3j); this was discussed 
briefly in §|[ Basically, one hypothesises the existence of a hidden space h that 
is linearly-related to the signal space s by an intrinsic correlation function L, 
such that 

s = Lh. (23) 

One then performs the MEM reconstruction in terms of h (which is a priori 
uncorrelated) . The corresponding signal reconstruction s will thus have an 
intrinsic correlation length determined by L and hence fewer independent 
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degrees of freedom. This clearly corresponds to the pixon method with all 
the pixon widths 6i being equal. 

However, this simple method can be greatly enhanced by choosing L in 
a more innovate way. The most obvious extension is to allow for the exis- 
tence of multiple hidden fields, each related to signal space by convolutions 
of different widths. This then reproduces the ability to have varying effective 
correlation lengths across the image, but in a such way that the the corre- 
lation length at each point in the image is determined via a proper entropic 
regularisation of the hidden fields, and not by an arbitrary least-squares cri- 
terion as is done in the pixon method. Application of this 'multiscale MEM' 
technique to numerous types of images has shown it to be very successful. An 
interesting astronomical example is again provided by gravitational lensing. 
Figure I?] shows the reconstruction of the projected mass density in the cluster 
MS1054 from the shear data shown in Figure H. In this case, however, the re- 
Fig. 7. The multiscale MEM reconstruction of the projected mass density in the 
cluster MS 1054 (courtesy of Phil Marshall) 
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construction has been performed using a 4-level multiscale MEM algorithm. 
By comparing with the traditional MEM reconstruction in Figure |3| one sees 
that the small scale rippling has disappeared, and indeed the calculated ev- 
idence for the 4-scale reconstruction is much higher. By way of illustration, 
in Figure |^ we also plot the corresponding hidden fields that constitute the 
reconstruction. 

One can see from figure || that the mutliscale MEM approach is equivalent 
to providing a set of (redundant, non-orthogonal) basis functions for the 
image that are simply the different intrinsic correlation functions. The MEM 
is simply obtaining a properly regularised optimal solution for the values of 
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Fig. 8. The hidden fields that constitute the multiscale MEM reconstruction in 
Figure ^ of MS 1054 (courtesy of Charhe McLachlan) 



the coefficients of each basis function required to reconstruct the image. Once 
viewed in this way, one may wonder if there exist more efficient sets of basis 
functions one could use to describe the image. Clearly, the number of degrees 
of freedom is simply equal to the number of basis functions required, and 
so one wishes to find a basis in which general images can be described with 
relatively few basis functions. The obvious choice is wavelets. These functions 
are constructed so that they are well-localised in both position and frequency 
space, and have proven to be very effective in representing an image with few 
basis functions (their extensive use in image compression is also obviously a 
result of this property). Indeed, by using a wavelet transform kernel to relate 
the spaces h and s, the reconstruction quality can be improved still further. 

6 Massive Inference 



Massive Inference (]1J],|15 ) can be seen as an even more extreme choice of 
basis functions. In this method, we throw away the underlying pixelisation 
grid and instead represent the object as a variable number of 'atoms' or 'point 
masses'. Each of these is described by a position Xj and a flux Zj. To simulate 
a continuum, Xj runs over 2^^ positions. We need to assign a prior probablity 
to X and z as well as to the number of atoms N . 

Each of the N locations is assigned a uniform prior, i.e., Pr{xi) — constant. 
The number of atoms can be assigned a Poisson distribution with a given 
mean a: 

e~"a^ 

Pr{N) = (24) 

Finally, each of the N amplitudes is assigned an exponential prior (with 
parameter q): 

Pr{z,) = —— (25) 

The program (using Markov Chain Monte Carlo sampling and simulated 
annealing) then samples the posterior probability (which also includes the 
likelihood term) treating a and q as hyperparameters. 

So far, some spectacular results have been obtained for 1-dimensional 
spectra and 2-dimensional point sources. An early application of Massive In- 
ference has been to flash photolysis data for proteins in corn grains, including 
a comparison with MEM. Figure || shows simulations of the decay of lumi- 
nescence measured in the experiment. The important question is how many 
decaying exponentials are present in these data, and what are their decay 
rates. In Figure |l^ the results obtained using Massinf and MEM are given. 
We see that Massinf is far more successful in determining a narrow range 
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Fig. 9. The decay of luminescence in a simulated flash photolysis experiment 




Fig. 10. The decaying exponential components present in Figure ^ as determined 
by Massinf (left) and MEM (right) respectively 



of possible decay rates. Most importantly, the Massinf algorithm can also 
provide the probability distribution for the number of distinct exponential 
components, as shown in Figure [TTI . 

Table ^ summarises the similarities and differences between MEM and 
Massive Inference. 



7 Conclusions 

We have seen that a Bayesian approach provides a common statistical frame- 
work for several important methods currently used in astronomical processing 
and analysis. By generalising one's view to include also the optimal choice of 
basis functions it is clear how significant improvements can be obtained both 
in the quality of the resulting reconstructions and in the speed at which the 
analysis can be carried out. These two aspects will both be crucial in the new 
era of quantitative cosmology which is now opening up. 
Acknowledgments 
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Fig. 11. The probability distribution for the number of distinct decaying exponen- 
tial components, as determined by Massinf 



Table 2. Comparison between of MEM and Massive Inference 



MEM 



Massinf 



Pixel based 
Gradient Search 
Needs differential R{a) 
and adjoint R{a) 
Poisson errors OK 
Multi-dimensional 

Gaussian approximation 
Fast global transforms OK 
Computing time cx grid size 



Continuum 
Marlcov Chain Monte Carlo 
Transform R{f) only 

only 
One-dimensional 
(needs Peano curve) 

Direct sampling 
Slow atom transforms 
Computing time oc number atoms 
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simulations shown in Figures 
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